From Micro- to Macro-processing: A Generic Data Management Model
Abstract
Rapid progress in distributed computing has enabled collaborative studies and provided essential compute power for science, and it has made data administration a critical challenge in this area. Scientific applications have become more data intensive; moreover, data management can be more demanding than computation in terms of required resources. Many recent studies have investigated new approaches to data management and data transfer in distributed systems. At the same time, data management has been a crucial problem at every stage of computer engineering. Accessing data in a transparent and efficient manner is a major issue in both operating system design and microprocessor architecture. Data management and data access approaches used at the operating system level and in processor design may be applicable to distributed systems. We analyze data access and data management issues in different layers of computer systems, starting at the microprocessor level, extending to the operating system level, and then investigating data handling in distributed systems. We characterize each layer in terms of similarities and differences in storage operations. We study problems and challenges in data access and data storage, and we investigate the techniques applied in small and large scale systems. Studying and analyzing the problem at different scales will enable us to better understand data management in distributed systems. In distributed systems, different service layers are defined and the overall load is shared among separate components. Different software and hardware elements therefore serve specific purposes in maintaining the overall system. The basic components of a distributed system are computing and storage resources that can reach each other over network connections. Specialized data servers and data storage sites have been formed to meet the storage requirements of today's applications.
Required input data is either transferred and stored in a temporary space or accessed remotely over the network. Because of the nature of the interconnects between distributed elements, many factors, such as latency and bandwidth, affect performance. This leads to staging and caching services that make data easily and quickly accessible. In the recent past, microprocessors have been one of the fastest growing technologies. In early models, accessing data was a major problem, and it remains a crucial issue today: data retrieval from memory to the execution unit is slow because of the speed of memory and the latency of the bus between memory and CPU. Many techniques have been applied to overcome the speed gap between arithmetic operations and memory access operations. Instruction scheduling starts with the out-of-order execution technique used in x86 designs and extends to dynamic scheduling algorithms in which operations are buffered in the issue phase before they are sent to execution units. Prefetch registers, multi-level caches, and complex branch prediction algorithms are among the other methodologies used in processor design. Execution stages are split into execution units and into multiple pipelines to enhance performance through pipelining and superscalar design. Load-store operations are separated from other execution units, and I/O instructions are scheduled and executed by specialized units in microprocessors. Many approaches have also been used at the operating system level to access data efficiently. DMA, I/O scheduling, file systems, disk management, simultaneous resource sharing, and low-level parallelism such as disk striping are some common examples. The I/O bottleneck has also been studied in parallel computer systems, where many interesting techniques have been proposed and applied successfully.
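Disk striping, listed above as an example of low-level parallelism, can be illustrated with a minimal sketch. It assumes a simple round-robin layout with one block per stripe unit; the function names `stripe_location` and `split_request` are hypothetical and chosen only for this illustration, not taken from any particular system.

```python
from typing import List, Tuple


def stripe_location(logical_block: int, num_disks: int) -> Tuple[int, int]:
    """Map a logical block to (disk index, block index on that disk).

    Assumes round-robin striping with a stripe unit of one block.
    """
    if logical_block < 0 or num_disks <= 0:
        raise ValueError("logical_block must be >= 0 and num_disks > 0")
    return logical_block % num_disks, logical_block // num_disks


def split_request(first_block: int, count: int, num_disks: int) -> List[List[int]]:
    """Group a contiguous request by disk.

    Each inner list holds the per-disk block indices that one disk must
    serve, so the disks could work on their parts in parallel.
    """
    per_disk: List[List[int]] = [[] for _ in range(num_disks)]
    for block in range(first_block, first_block + count):
        disk, offset = stripe_location(block, num_disks)
        per_disk[disk].append(offset)
    return per_disk


if __name__ == "__main__":
    # Six consecutive logical blocks starting at block 2, spread over four disks.
    print(split_request(2, 6, 4))
```

Because consecutive logical blocks land on different disks, a large sequential request is divided into smaller per-disk requests. The same partitioning idea is one plausible way to think about spreading a large data set over several storage sites.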
Throughout the evolution of computer systems, data handling has been delegated to separate components and has required special treatment. We start with microprocessors, the basic component of computer systems, and study data management in processor architecture. We focus on the problems in accessing and storing data that were encountered in the early phases of processor design, and we analyze how those issues were resolved to enable today's fast and efficient microprocessors. Then, we study disk scheduling and I/O scheduling techniques in operating systems for single-processor and multiprocessor environments. We mainly concentrate on mapping traditional operating system concepts to widely distributed environments in terms of scheduling and I/O management. Since small and large scale systems have similar problems, we plan to draw on approaches from computer architecture and operating system kernel design to reach a broader perspective in which known methodologies can be extended for use in distributed systems.
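As one concrete instance of the disk scheduling techniques referred to above, the following sketch orders pending requests with the LOOK variant of the elevator algorithm and compares its head movement with first-come-first-served order. The abstract does not say which algorithms are studied, so the choice of LOOK, the function names, and the sample request queue are assumptions made purely for illustration.

```python
from typing import List


def look_order(requests: List[int], head: int) -> List[int]:
    """Return a service order for the LOOK elevator policy.

    The head first sweeps upward from its current position, serving
    requests in increasing order, then reverses and serves the rest
    in decreasing order.
    """
    upward = sorted(r for r in requests if r >= head)
    downward = sorted((r for r in requests if r < head), reverse=True)
    return upward + downward


def total_seek(order: List[int], head: int) -> int:
    """Sum the head movement needed to serve requests in the given order."""
    distance = 0
    position = head
    for request in order:
        distance += abs(request - position)
        position = request
    return distance


if __name__ == "__main__":
    queue = [98, 183, 37, 122, 14, 124, 65, 67]  # hypothetical cylinder numbers
    start = 53
    print("FCFS seek distance:", total_seek(queue, start))
    print("LOOK seek distance:", total_seek(look_order(queue, start), start))
```

Reordering requests to reduce movement between them is the kind of concept that might carry over to distributed settings, for example by ordering transfer requests so that data from the same storage site is fetched together.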